Optionality in evaluating prosody prediction

Author

  • Erwin Marsi
Abstract

This paper concerns the evaluation of prosody prediction at the symbolic level, in particular the locations of pitch accents and intonational boundaries. One evaluation method is to ask an expert to annotate text prosodically, and to compare the system’s predictions with this reference. However, this ignores the issue of optionality: there is usually more than one acceptable way to place accents and boundaries. Therefore, predictions that do not match the reference are not necessarily wrong. We propose dealing with this issue by means of a 3-class annotation which includes a class for optional accents/boundaries. We show, in a prosody prediction experiment using a memory-based learner, that evaluating against a 3-class annotation derived from multiple independent 2-class annotations allows us to identify the real prediction errors and to better estimate the real performance. Next, it is shown that a 3-class annotation produced directly by a single annotator yields a reasonable approximation of the more expensive 3-class annotation derived from multiple annotations. Finally, the results of a larger scale experiment confirm our findings.
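The evaluation idea in the abstract can be illustrated with a minimal sketch. The abstract does not state how the 3-class annotation is derived from the 2-class annotations, so the unanimity rule used here is an assumption: a position is "obligatory" if every annotator marked it, "impossible" if none did, and "optional" otherwise. The function names, label strings, and toy data are hypothetical and not taken from the paper.

```python
def derive_three_class(annotations):
    """Merge several 2-class annotations into one 3-class annotation.

    `annotations` holds one list of 0/1 marks per annotator, aligned by token.
    Assumed rule (not from the paper): unanimous 1 -> obligatory,
    unanimous 0 -> impossible, any disagreement -> optional.
    """
    labels = []
    for votes in zip(*annotations):
        if all(votes):
            labels.append("obligatory")
        elif not any(votes):
            labels.append("impossible")
        else:
            labels.append("optional")
    return labels


def real_errors(predictions, reference):
    """Return token indices where a prediction contradicts the 3-class reference.

    A prediction at an optional position is never counted as an error.
    """
    errors = []
    for i, (pred, ref) in enumerate(zip(predictions, reference)):
        if ref == "obligatory" and not pred:
            errors.append(i)  # missed an accent/boundary that everyone placed
        elif ref == "impossible" and pred:
            errors.append(i)  # placed an accent/boundary that nobody placed
    return errors


# Toy data: three annotators, four tokens (1 = accent or boundary present).
annotators = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
]
predictions = [1, 1, 0, 1]

reference = derive_three_class(annotators)
mismatches_single = [i for i, (p, r) in enumerate(zip(predictions, annotators[0])) if p != r]
errors_three_class = real_errors(predictions, reference)
```

In this toy case the predictions disagree with the first annotator at three positions, but only one of those contradicts the 3-class reference; the other two fall on optional positions.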

Similar resources

Issues of optionality in pitch accent placement

When comparing the prosodic realization of different English speakers reading the same text, a significant disagreement is usually found amongst the pitch accent patterns of the speakers. Assuming that such disagreement is due to a partial optionality of pitch accent placement, it has been recently proposed to evaluate pitch accent predictors by comparing them with multispeaker reference data. ...

Automatic Assessment of Prosody in High-Stakes English Tests

Prosody can be used to infer whether or not candidates fully understand a passage they are reading aloud. In this paper, we focused on automatic assessment of prosody in a read-aloud section for a high-stakes English test. A new method was proposed to handle fundamental frequency (F0) of unvoiced segments that significantly improved the predictive power of F0. The k-means clustering method was u...

Joint prosody prediction and unit selection for concatenative speech synthesis

In this paper we describe how prosody prediction can be efficiently integrated with the unit selection process in a concatenative speech synthesizer under a weighted finite-state transducer (WFST) architecture. WFSTs representing prosody prediction and unit selection can be composed during synthesis, thus effectively expanding the space of possible prosodic targets. We implemented a symbolic pr...

Comparisons among Four Statistics Based Methods of Prosody Structure Prediction

Prosody structure prediction plays an important role in text-to-speech (TTS) conversion systems. It is a necessary step that precedes parametric prosody prediction. Dynamic programming (DP) and decision trees (DT) are widely used for prosody structure prediction [1][2][3] but have well-known limitations. In this paper, two other new methods, combination of dynamic programming with decision tree and c...

Identifying prosodic prominence patterns for English text-to-speech synthesis

This thesis proposes to improve and enrich the expressiveness of English Text-to-Speech (TTS) synthesis by identifying and generating natural patterns of prosodic prominence. In most state-of-the-art TTS systems the prediction from text of prosodic prominence relations between words in an utterance relies on features that very loosely account for the combined effects of syntax, semantics, word i...

Publication date: 2004